Natjam: Eviction Policies For Supporting Priorities and Deadlines in Mapreduce Clusters

Authors

  • Brian Cho
  • Muntasir Rahman
  • Tej Chajed
  • Indranil Gupta
  • Cristina Abad
  • Nathan Roberts
  • Philbert Lin
Abstract

This paper presents Natjam, a system that supports arbitrary job priorities, hard real-time scheduling, and efficient preemption for resource-constrained MapReduce clusters. Our contributions include: i) smart eviction policies for jobs and for tasks, based on resource usage, task runtime, and job deadlines; and ii) a work-conserving task preemption mechanism. We incorporated Natjam into the Hadoop YARN scheduler framework (in Hadoop 0.23). We present experiments from deployments on a test cluster, Emulab and a Yahoo! commercial cluster, using both synthetic traces and Hadoop cluster traces obtained from Yahoo!. Our results reveal that Natjam incurs overheads of under 7%. Under real Hadoop workloads, Natjam performs better than existing techniques.


Similar resources

Research Statement - Muntasir Raihan Rahman

My research goal is to build adaptive big data and cloud systems that can meet a spectrum of user requirements expressed using service level agreements (SLA) and service level objectives (SLO). During my PhD, I have worked on several angles of tracking and enforcing SLA/SLO guarantees in cloud systems, including MapReduce clusters and NoSQL key-value storage systems. My future research goal...


Application profiling and resource management for MapReduce

The scale of data generated and processed is growing exponentially in the Big Data era. This poses a challenge far beyond the capacity of a single computing system. Processing such vast amounts of data on a single machine is impractical in terms of time or cost. Hence, distributed systems, which can harness very large clusters of commodity co...


PASS: Power-Aware Scheduling of Mixed Applications with Deadline Constraints on Clusters

Reducing energy consumption has become a pressing issue in cluster computing systems not only for minimizing electricity cost, but also for improving system reliability. Therefore, it is highly desirable to design energy-efficient scheduling algorithms for applications running on clusters. In this paper, we address the problem of non-preemptively scheduling mixed tasks on power-aware clusters. ...


Network-Aware Task Assignment for MapReduce Applications in Shared Clusters

Running MapReduce applications in shared clusters is becoming increasingly compelling as a way to improve cluster utilization. However, network sharing across diverse applications can make the network bandwidth available to MapReduce applications constrained and heterogeneous, which inevitably increases the severity of network hotspots in racks, and makes the existing task assignment policies that focus ...


Scather: programming with multi-party computation and MapReduce

We present a prototype of a distributed computational infrastructure, an associated high-level programming language, and an underlying formal framework that allow multiple parties to leverage their own cloud-based computational resources (capable of supporting MapReduce [27] operations) in concert with multi-party computation (MPC) to execute statistical analysis algorithms that have privacy-pre...



Publication date: 2013